- 1. Basics
- How to Compare Arrays in PostgreSQL
- How to Concatenate Strings in PostgreSQL
- How to Convert the Case of a String
- How to Create an Array
- How to Insert Array Data Into a Table
- How to Insert Data Into an Array
- How to Modify Arrays
- How to Query Arrays
- How to Replace Substrings
- How to Trim Strings
- How to Use string_agg()
- How to Use substring()
- How to Use substring() with Regular Expressions
- How to Insert
- How to Update
- How to Delete
- 2. Database Management
- How to Create a Table
- How to Drop a Table
- How to Rename a Table
- How to Truncate a Table
- How to Duplicate a Table
- How to Add a Column
- How to Drop a Column
- How to Rename a Column
- How to Add a Default Value to a Column
- How to Remove a Default Value From a Column
- How to Add a Not Null Constraint
- How to Remove a Not Null Constraint
- How to Create an Index
- How to Drop an Index
- How to Create a View
- How to Drop a View
- How to Alter Sequence
- 3. Dates And Time
- How to Exclude Current or Partial Weeks
- How to Use BETWEEN Correctly
- How to Query Date and Time
- How to Group by Time
- How to Round Timestamps
- How to Convert Local Time to UTC
- 4. Analysis
- How to Use Lateral Joins
- How to Use nullif()
- How to Get the First Row per Group
- How to Use generate_series to Avoid Gaps In Data
- How to Do Type Casting
- How to Write a Common Table Expression
- How to Import a CSV
- How to Compare Two Values When One Is Null
- How to Use Coalesce
- How to Write a Case Statement
- How to Query a JSON Column
- How to Use Filter Clause to Have Multiple Counts
- How to Calculate Cumulative Sum-Running Total
- 1. Basics
- How to Concatenate Strings in MS-SQL Server
- How to Convert the Case of a String
- How to replace a Substring in a String
- How to Trim Strings
- How to Use string_agg()
- How to Use substring()
- How to Insert
- How to Update
- How to Delete
- 2. Database Management
- How to Create a Table in MS-SQL
- How to Drop a Table in MS-SQL
- How to Rename a Table in MS-SQL
- How to Truncate a Table in MS-SQL
- How to Duplicate a Table in MS-SQL
- How to Add a Column in MS-SQL
- How to Drop a Column in MS-SQL
- How to Rename a Column in MS-SQL
- How to Add a Default Value to a Column in MS-SQL
- How to Remove a Default Value From a Column in MS-SQL
- How to Add a Not Null Constraint in MS-SQL
- How to Remove a Not Null Constraint in MS-SQL
- How to Create an Index in MS-SQL
- How to Drop an Index in MS-SQL
- How to Create a View in MS-SQL
- How to Drop a View in MS-SQL
- How to Alter Sequence in MS-SQL
- 3. Dates And Time
- How to Exclude Current or Partial Weeks
- How to Use BETWEEN Correctly
- How to Query Date and Time in MS-SQL
- How to Group by Time
- How to Extract a Component From a Datetime
- 4. Analysis
- How to Use Lateral Joins
- How to Use nullif()
- How to Get the First Row per Group
- How to Do Type Casting in MS-SQL
- How to Write a Common Table Expression
- How to Import a CSV
- How to Use Coalesce
- How to Write a Case Statement
- How to Query a JSON Column
- How to Calculate Cumulative Sum-Running Total
- 1. Basics
- How to Concatenate Strings in MySQL
- How to Convert the Case of a String
- How to replace a Substring in a String
- How to Trim a String
- How to use group_concat()
- How to do you Use a substring()
- How to Use substring() with Regular Expressions
- How to use insert into statement
- How to use Update statement in SQL
- How to use Delete SQL Statement
- 2. Database Management
- How to Create a Table in MySQL
- How to Drop a Table in MySQL
- How to Rename a Table in MySQL
- How to Truncate a Table in MySQL
- How to Duplicate a Table in MySQL
- How to Add a Column in MySQL
- How to Drop a Column in MySQL
- How to Rename a Column in MySQL
- How to Add a Default Value to a Column in MySQL
- How to Remove a Default Value From a Column in MySQL
- How to Add a Not Null Constraint in MySQL
- How to Remove a Not Null Constraint in MySQL
- How to Create an Index in MySQL
- How to Drop an Index in MySQL
- How to Create a View in MySQL
- How to Drop a View in MySQL
- How to Alter Sequence in MySQL
- 3. Dates And Time
- How to Exclude Current or Partial Weeks
- How to Use BETWEEN Correctly
- How to Query Date and Time in MySQL
- How to Group by Time
- How to Round Timestamps
- 4. Analysis
- How to Use nullif()
- How to Get the First Row per Group
- How to Do Type Casting in MySQL
- How to Write a Common Table Expression
- How to Import a CSV
- How to Use Coalesce
- How to Write a Case Statement
- How to Query a JSON Column
- How to Calculate Cumulative Sum-Running Total
- 1. Basics
- How to Concatenate Strings in Oracle
- How to Convert the Case of a String
- How to Replace Substrings
- How to Trim Strings
- How to Use listagg()
- How to Use substring()
- How to Use substring() with Regular Expressions
- How to Insert
- How to Update
- How to Delete
- 2. Database Management
- How to Create a Table
- How to Drop a Table
- How to Rename a Table
- How to Truncate a Table
- How to Duplicate a Table
- How to Add a Column
- How to Drop a Column
- How to Rename a Column
- How to Add a Default Value to a Column
- How to Remove a Default Value From a Column
- How to Add a Not Null Constraint
- How to Remove a Not Null Constraint
- How to Create an Index
- How to Drop an Index
- How to Create a View
- How to Drop a View
- How to Alter Sequence
- 3. Dates And Time
- How to Exclude Current or Partial Weeks
- How to Use BETWEEN Correctly
- How to Query Date and Time
- How to Group by Time
- How to Round Timestamps
- 4. Analysis
- How to Use nullif()
- How to Get the First Row per Group
- How to Do Type Casting
- How to Write a Common Table Expression
- How to Import a CSV
- How to Use Coalesce
- How to Write a Case Statement
- How to Query a JSON Column
- How to Use Filter Clause to Have Multiple Counts
- How to Calculate Cumulative Sum-Running Total
- 1. Basics
- How to Concatenate Strings in BigQuery
- How to Convert the Case of a String
- How to replace a Substring in a String
- How to Trim a String
- How to Use string_agg()
- How to use substring() function
- How to Use substring() with Regular Expressions
- How to Use BETWEEN Correctly
- How to use insert into statement
- How to use Update statement in SQL
- How to use Delete SQL Statement
- 2. Database Management
- How to Create a Database in BigQuery
- How to Create a Table in BigQuery
- How to Drop a Table in BigQuery
- How to Rename a Table in BigQuery
- How to Truncate Table in BigQuery
- How to Duplicate a Table in BigQuery
- How to Add a Column in BigQuery
- How to Drop a Column in BigQuery
- How to Add a Default Value to a Column in BigQuery
- How to Add a Not Null Constraint in BigQuery
- How to Remove a Not Null Constraint in BigQuery
- How to Create a View in BigQuery
- How to Drop a View in BigQuery
- 3. Dates And Time
- How to Exclude Current or Partial Weeks
- How to Query Date and Time
- How to Group by Time
- How to Round Timestamps
- 4. Analysis
- How to Use nullif()
- How to Get the First Row per Group
- How to Do Type Casting
- How to Write a Common Table Expression
- How to Compare Two Values When One Is Null
- How to Use Coalesce
- How to Write a Case Statement
- How to Calculate Cumulative Sum-Running Total
SQL GROUP BY Clause in a SELECT Query
- 1. Basics
- How to Compare Arrays in PostgreSQL
- How to Concatenate Strings in PostgreSQL
- How to Convert the Case of a String
- How to Create an Array
- How to Insert Array Data Into a Table
- How to Insert Data Into an Array
- How to Modify Arrays
- How to Query Arrays
- How to Replace Substrings
- How to Trim Strings
- How to Use string_agg()
- How to Use substring()
- How to Use substring() with Regular Expressions
- How to Insert
- How to Update
- How to Delete
- 2. Database Management
- How to Create a Table
- How to Drop a Table
- How to Rename a Table
- How to Truncate a Table
- How to Duplicate a Table
- How to Add a Column
- How to Drop a Column
- How to Rename a Column
- How to Add a Default Value to a Column
- How to Remove a Default Value From a Column
- How to Add a Not Null Constraint
- How to Remove a Not Null Constraint
- How to Create an Index
- How to Drop an Index
- How to Create a View
- How to Drop a View
- How to Alter Sequence
- 3. Dates And Time
- How to Exclude Current or Partial Weeks
- How to Use BETWEEN Correctly
- How to Query Date and Time
- How to Group by Time
- How to Round Timestamps
- How to Convert Local Time to UTC
- 4. Analysis
- How to Use Lateral Joins
- How to Use nullif()
- How to Get the First Row per Group
- How to Use generate_series to Avoid Gaps In Data
- How to Do Type Casting
- How to Write a Common Table Expression
- How to Import a CSV
- How to Compare Two Values When One Is Null
- How to Use Coalesce
- How to Write a Case Statement
- How to Query a JSON Column
- How to Use Filter Clause to Have Multiple Counts
- How to Calculate Cumulative Sum-Running Total
- 1. Basics
- How to Concatenate Strings in MS-SQL Server
- How to Convert the Case of a String
- How to replace a Substring in a String
- How to Trim Strings
- How to Use string_agg()
- How to Use substring()
- How to Insert
- How to Update
- How to Delete
- 2. Database Management
- How to Create a Table in MS-SQL
- How to Drop a Table in MS-SQL
- How to Rename a Table in MS-SQL
- How to Truncate a Table in MS-SQL
- How to Duplicate a Table in MS-SQL
- How to Add a Column in MS-SQL
- How to Drop a Column in MS-SQL
- How to Rename a Column in MS-SQL
- How to Add a Default Value to a Column in MS-SQL
- How to Remove a Default Value From a Column in MS-SQL
- How to Add a Not Null Constraint in MS-SQL
- How to Remove a Not Null Constraint in MS-SQL
- How to Create an Index in MS-SQL
- How to Drop an Index in MS-SQL
- How to Create a View in MS-SQL
- How to Drop a View in MS-SQL
- How to Alter Sequence in MS-SQL
- 3. Dates And Time
- How to Exclude Current or Partial Weeks
- How to Use BETWEEN Correctly
- How to Query Date and Time in MS-SQL
- How to Group by Time
- How to Extract a Component From a Datetime
- 4. Analysis
- How to Use Lateral Joins
- How to Use nullif()
- How to Get the First Row per Group
- How to Do Type Casting in MS-SQL
- How to Write a Common Table Expression
- How to Import a CSV
- How to Use Coalesce
- How to Write a Case Statement
- How to Query a JSON Column
- How to Calculate Cumulative Sum-Running Total
- 1. Basics
- How to Concatenate Strings in MySQL
- How to Convert the Case of a String
- How to replace a Substring in a String
- How to Trim a String
- How to use group_concat()
- How to do you Use a substring()
- How to Use substring() with Regular Expressions
- How to use insert into statement
- How to use Update statement in SQL
- How to use Delete SQL Statement
- 2. Database Management
- How to Create a Table in MySQL
- How to Drop a Table in MySQL
- How to Rename a Table in MySQL
- How to Truncate a Table in MySQL
- How to Duplicate a Table in MySQL
- How to Add a Column in MySQL
- How to Drop a Column in MySQL
- How to Rename a Column in MySQL
- How to Add a Default Value to a Column in MySQL
- How to Remove a Default Value From a Column in MySQL
- How to Add a Not Null Constraint in MySQL
- How to Remove a Not Null Constraint in MySQL
- How to Create an Index in MySQL
- How to Drop an Index in MySQL
- How to Create a View in MySQL
- How to Drop a View in MySQL
- How to Alter Sequence in MySQL
- 3. Dates And Time
- How to Exclude Current or Partial Weeks
- How to Use BETWEEN Correctly
- How to Query Date and Time in MySQL
- How to Group by Time
- How to Round Timestamps
- 4. Analysis
- How to Use nullif()
- How to Get the First Row per Group
- How to Do Type Casting in MySQL
- How to Write a Common Table Expression
- How to Import a CSV
- How to Use Coalesce
- How to Write a Case Statement
- How to Query a JSON Column
- How to Calculate Cumulative Sum-Running Total
- 1. Basics
- How to Concatenate Strings in Oracle
- How to Convert the Case of a String
- How to Replace Substrings
- How to Trim Strings
- How to Use listagg()
- How to Use substring()
- How to Use substring() with Regular Expressions
- How to Insert
- How to Update
- How to Delete
- 2. Database Management
- How to Create a Table
- How to Drop a Table
- How to Rename a Table
- How to Truncate a Table
- How to Duplicate a Table
- How to Add a Column
- How to Drop a Column
- How to Rename a Column
- How to Add a Default Value to a Column
- How to Remove a Default Value From a Column
- How to Add a Not Null Constraint
- How to Remove a Not Null Constraint
- How to Create an Index
- How to Drop an Index
- How to Create a View
- How to Drop a View
- How to Alter Sequence
- 3. Dates And Time
- How to Exclude Current or Partial Weeks
- How to Use BETWEEN Correctly
- How to Query Date and Time
- How to Group by Time
- How to Round Timestamps
- 4. Analysis
- How to Use nullif()
- How to Get the First Row per Group
- How to Do Type Casting
- How to Write a Common Table Expression
- How to Import a CSV
- How to Use Coalesce
- How to Write a Case Statement
- How to Query a JSON Column
- How to Use Filter Clause to Have Multiple Counts
- How to Calculate Cumulative Sum-Running Total
- 1. Basics
- How to Concatenate Strings in BigQuery
- How to Convert the Case of a String
- How to replace a Substring in a String
- How to Trim a String
- How to Use string_agg()
- How to use substring() function
- How to Use substring() with Regular Expressions
- How to Use BETWEEN Correctly
- How to use insert into statement
- How to use Update statement in SQL
- How to use Delete SQL Statement
- 2. Database Management
- How to Create a Database in BigQuery
- How to Create a Table in BigQuery
- How to Drop a Table in BigQuery
- How to Rename a Table in BigQuery
- How to Truncate Table in BigQuery
- How to Duplicate a Table in BigQuery
- How to Add a Column in BigQuery
- How to Drop a Column in BigQuery
- How to Add a Default Value to a Column in BigQuery
- How to Add a Not Null Constraint in BigQuery
- How to Remove a Not Null Constraint in BigQuery
- How to Create a View in BigQuery
- How to Drop a View in BigQuery
- 3. Dates And Time
- How to Exclude Current or Partial Weeks
- How to Query Date and Time
- How to Group by Time
- How to Round Timestamps
- 4. Analysis
- How to Use nullif()
- How to Get the First Row per Group
- How to Do Type Casting
- How to Write a Common Table Expression
- How to Compare Two Values When One Is Null
- How to Use Coalesce
- How to Write a Case Statement
- How to Calculate Cumulative Sum-Running Total
So far the query results are record-oriented, meaning each row in the result corresponds to one record in the table. The SQL GROUP BY
clause opens up the possibility that queries have a row in the result representing a group of records. Plus, it enables you to use grouping functions: functions that return some result () calculated in the group of records such as the sum, average or quantity.
Here is the example table. Suppose you work on a farm growing apples. The database contains a single table named apple with all the production history. The table has the following schema:
Year | Apple Variety | Number of Trees in Production | Tons Produced | Harvest day | Price per Ton | First summer storm |
---|---|---|---|---|---|---|
2020 | Red Delicious | 2000 | 102 | 06/23/2020 | 54.50 | 07/03/2020 |
2020 | Magic Green | 700 | 33 | 06/12/2020 | 62.60 | 07/03/2020 |
2020 | Red Globus | 500 | 26 | 05/30/2020 | 71.50 | 07/03/2020 |
2019 | Red Delicious | 1800 | 87 | 07/15/2019 | 52.25 | 07/12/2019 |
2019 | Magic Green | 500 | 26 | 06/28/2019 | 59.40 | 07/12/2019 |
2019 | Red Globus | 500 | 27 | 05/28/2019 | 68.00 | 07/12/2019 |
2018 | Red Delicious | 1800 | 92 | 07/02/2018 | 56.75 | 06/03/2018 |
2018 | Red Globus | 500 | 24 | 05/30/2018 | 66. | 06/03/2018 |
2017 | Red Delicious | 1500 | 76.5 | 07/18/2017 | 51.45 | 07/30/2017 |
2016 | Red Delicious | 1500 | 72 | 06/26/2016 | 47.60 | 06/23/2016 |
Suppose you want to obtain the total of apples produced every year. This value is not a column table apple
, but there is a tons_produced
column per apple variety. If you group all the records for the same year and sum the value in tons_produced
for all the records in the group, you obtain the total. Here is the GROUP BY
clause you would use:
SELECT year,
SUM(tons_produced) AS Total_apple_produced
FROM apple
GROUP BY year
This query creates groups of records with the same value in the column year. The first group is for year 2020 with 3 records, the next group is 2019 with 3 records, and so on. For each group of records you apply the Aggregate Functions SUM
to all the records in the group to obtain the total tons produced each year. Here are the results of the query:
Year | Total_apple_produced |
---|---|
2020 | 161 |
2019 | 140 |
2018 | 116 |
2017 | 76.5 |
2016 | 72 |
Using SQL WHERE and SQL GROUP BY in a SELECT
In this lesson, you combine all the concepts or clauses you have learned into a single query. You use a WHERE clause to filter records and a GROUP BY to group records in the same SELECT statement. If you want to review the WHERE clause, jump back to the lesson Using The SQL WHERE Clause With Comparison Operators .
The WHERE
clause is applied first to filter the records. The GROUP BY
clause is applied to the filtered records. Here is the query:
SELECT year,
AVG(price_per_ton) AS avg_price_ton
FROM apples
WHERE EXTRACT(MONTH FROM first_summer_storm) = 7
GROUP BY year
The query uses the EXTRACT
function to obtain the month from the first_summer_storm
to filter the records from July. Once the records are filtered, you create the groups of records with the GROUP BY
year, meaning all records from the same year are in the same group. After that, the average of the price_per_ton
is calculated. Here are the results of the query:
Year | avg_price_ton |
---|---|
2020 | 62.86 |
2019 | 59.88 |
2017 | 51.45 |
The title of the column with the average price was renamed to avg_price_ton
by the AS clause in the SELECT
. This is a common practice when you use aggregate functions like AVG
, SUM
or COUNT
. You are creating a calculated value that is not coming from any column. Using the AS clause assigns the name/description in the result, as the database does not know what column name to use.
SQL GROUP BY Collapse of Records Effect
This is an important point a SQL
developer must understand to avoid a common error when using the GROUP BY
clause. After the database creates the groups of records, all the records are collapsed into groups. You can no longer refer to any individual record column in the query. In the SELECT
list, you can only refer to columns that appear in the GROUP BY
clause. The columns appearing in the group are valid because they have the same value for all the records in the group.
Elements allowed to be in the SELECT list |
---|
COLUMNS used in the GROUP BY |
Aggrupation Functions (SUM, COUNT, AVG, MAX, MIN) |
Constants (text, numeric or date value) |
Here is an example of an invalid query:
SELECT year,
first_summer_storm, -- INVALID COLUMN
AVG(price_per_ton) AS avg_price_ton
FROM apples
WHERE EXTRACT(MONTH FROM first_summer_storm) = 7
GROUP BY year
Below is the error returned by the database:
ERROR: The column «apple.first_summer_storm» must be in GROUP BY
LINE 2: first_summer_storm, – INVALID COLUMN
As a reader exercise, find out why the column is invalid.
SQL Aggregate Functions With Expressions
So far you used aggregate functions with table columns as a single parameter. Now it is time to use the functions with expressions based on several table columns.
For a more complex query, suppose you want to obtain a metric called dollar per tree. here is the math calculation:
dollar_per_tree = ( tons_produced * price_per_ton ) / number_of_trees
The dollar per tree metric is a value related to each apple variety. For example, one apple variety can produce many tons, but usually has a low market price, while other varieties produce less in terms of tons, but have a higher price. The dollar per tree metric can help you figure out the relative value of the trees.
For this example, you want to obtain the average of dollars per tree for each apple variety, plus the maximum and minimum value for the metric. You need to group by apple variety not by year. Here is the query:
SELECT apple_variety,
AVG((tons_produced * price_per_ton)/number_of_trees) AS avg_dollar_per_tree,
MAX((tons_produced * price_per_ton)/number_of_trees) AS max_dollar_per_tree,
MIN((tons_produced * price_per_ton)/number_of_trees) AS min_dollar_per_tree
FROM apple
GROUP BY apple_variety
The result:
apple_variety | avg_dollar_per_tree | max_dollar_per_tree | min_dollar_per_tree |
---|---|---|---|
Red Delicious | 2.62 | 2.90 | 2.28 |
Magic Green | 3.01 | 3.08 | 2.95 |
Red Globus | 3.51 | 3.71 | 3.16 |
Of course the dollar per tree metric is related to the year, as tons_produced
depends on weather conditions. To see the metric for each year and each apple variety, use a GROUP BY
multi-columns:
SELECT year,
apple_variety,
AVG((tons_produced * price_per_ton)/number_of_trees) AS avg_dollar_per_tree,
MAX((tons_produced * price_per_ton)/number_of_trees) AS max_dollar_per_tree,
MIN((tons_produced * price_per_ton)/number_of_trees) AS min_dollar_per_tree
FROM apple
GROUP BY year, apple_variety
The result:
year | apple_variety | avg_dollar_per_tree | max_dollar_per_tree | min_dollar_per_tree |
---|---|---|---|---|
2020 | Red Delicious | 2.77 | 2.77 | 2.77 |
2020 | Magic Green | 2.95 | 2.95 | 2.95 |
2020 | Red Globus | 3.71 | 3.71 | 3.71 |
2019 | Red Delicious | 2.52 | 2.52 | 2.52 |
2019 | Magic Green | 3.08 | 3.08 | 3.08 |
2019 | Red Globus | 3.67 | 3.67 | 3.67 |
2018 | Red Delicious | 2.90 | 2.90 | 2.90 |
2018 | Red Globus | 3.16 | 3.16 | 3.16 |
2017 | Red Delicious | 2.62 | 2.62 | 2.62 |
2016 | Red Delicious | 2.28 | 2.28 | 2.28 |
Looking at this result, notice several columns have the same values for AVG()
, MAX()
and MIN()
results. This is due to the groups created by the GROUP BY
having one record, resulting in the same average, maximum and minimum values.
Closing Words
In this lesson you learned to use the SQL GROUP BY
and aggregate functions to increase the power expressivity of the SQL SELECT statement. You know about the collapse issue, and understand you cannot reference individual records once the GROUP BY
clause is used.
If you need to access the individual record, SQL provides WINDOW FUNCTIONS
as you will learn in a later lesson. For example, if you want a report to show how much is the difference between current year price_per_ton
and average of price_ton
, you clearly need to access the individual record to obtain the current year price and the average price, you can use the SQL WINDOW FUNCTIONS.
The next lesson goes into more depth on the aggregate functions. You have already used some in this lesson such as AVG()
, SUM()
, MIN()
and MAX()
. Keep going, learn SQL
and increase your skills!
IN THIS PAGE